Multilingual Central Repository version 3.0: upgrading a very large lexical knowledge base

Authors

  • Aitor González Agirre
  • Egoitz Laparra
  • German Rigau
Abstract

This paper describes the upgrading process of the Multilingual Central Repository (MCR). The new MCR uses WordNet 3.0 as its Interlingual Index (ILI). The current version integrates, within the same EuroWordNet framework, wordnets for five languages: English, Spanish, Catalan, Basque and Galician. To provide ontological coherence across all the integrated wordnets, the MCR has also been enriched with a disparate set of ontologies: Base Concepts, Top Ontology, WordNet Domains and the Suggested Upper Merged Ontology. We also suggest a novel approach for improving some of the semantic resources integrated in the MCR, including a semiautomatic method for propagating domain information. The whole content of the MCR is freely available.
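The role of the Interlingual Index can be illustrated with a minimal sketch: each language's synset is attached to a shared language-neutral identifier, so a word in one wordnet can be mapped to its counterparts in another. The class names, the identifier `ili-example-0001`, the language codes, and the single sample entry are all assumptions made for illustration; they are not the MCR's actual schema, identifiers, or data.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Synset:
    """A set of synonymous lemmas in one language (hypothetical structure)."""
    language: str
    lemmas: List[str]


@dataclass
class InterlingualIndex:
    """Maps a language-neutral ID to one synset per language (illustrative only)."""
    records: Dict[str, Dict[str, Synset]] = field(default_factory=dict)

    def add(self, ili_id: str, synset: Synset) -> None:
        # Attach a language-specific synset to the shared ILI record.
        self.records.setdefault(ili_id, {})[synset.language] = synset

    def translate(self, lemma: str, source: str, target: str) -> List[str]:
        # Collect target-language lemmas from every ILI record whose
        # source-language synset contains the given lemma.
        results: List[str] = []
        for by_language in self.records.values():
            src = by_language.get(source)
            tgt = by_language.get(target)
            if src is not None and tgt is not None and lemma in src.lemmas:
                results.extend(tgt.lemmas)
        return results


# Made-up example record; the ID is NOT a real WordNet 3.0 offset.
index = InterlingualIndex()
for lang, lemma in [("eng", "dog"), ("spa", "perro"), ("cat", "gos"),
                    ("eus", "txakur"), ("glg", "can")]:
    index.add("ili-example-0001", Synset(lang, [lemma]))

print(index.translate("dog", "eng", "eus"))
```

Because every wordnet links to the same index rather than to each other, adding a sixth language in this sketch would only require attaching its synsets to existing records.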

Similar resources

Multilingual Central Repository version 3.0

This paper describes the upgrading process of the Multilingual Central Repository (MCR). The new MCR uses WordNet 3.0 as Interlingual-Index (ILI). Now, the current version of the MCR integrates in the same EuroWordNet framework wordnets from five different languages: English, Spanish, Catalan, Basque and Galician. In order to provide ontological coherence to all the integrated wordnets, the MCR...


Enriching Statistical Translation Models Using a Domain-Independent Multilingual Lexical Knowledge Base

This paper presents a method for improving phrase-based Statistical Machine Translation systems by enriching the original translation model with information derived from a multilingual lexical knowledge base. The method proposed exploits the Multilingual Central Repository (a group of linked WordNets from different languages), as a domain-independent knowledge database, to provide translation m...


The Habanera Lexical Knowledge Base Management System

Habanera is a multipurpose multilingual lexical knowledge base that is developed at CRL to be used as a central repository of multilingual lexical data. The knowledge base contains a set of dictionaries and relations between entries, within a dictionary (e.g., synonymy) as well as between entries of different dictionaries (e.g., translation). The format of monolingual lexical entries is left re...


Starting up the Multilingual Central Repository

Abstract: This article presents the initial design of the "Multilingual Central Repository". The first version of the MCR integrates, within the EuroWordNet framework, five local wordnets (including three versions of the Princeton WordNet), the EuroWordNet Top Ontology, the MultiWordNet domains, and hundreds of thousands of new semantic relations and properties acquired automatically...


Methodology and construction of the Basque WordNet

Semantic interpretation of language requires extensive and rich lexical knowledge bases (LKB). The Basque WordNet is a LKB based on WordNet and its multilingual counterparts EuroWordNet and the Multilingual Central Repository. This paper reviews the theoretical and practical aspects of the Basque WordNet lexical knowledge base, as well as the steps and methodology followed in its construction. ...



Publication date: 2011